Instructions Per Joule

In computing, performance per watt is a measure of the energy efficiency of a particular computer architecture or computer hardware. Literally, it measures the rate of computation that can be delivered by a computer for every watt of power consumed. When comparing computing systems, this rate is typically measured by performance on the LINPACK benchmark; one example is the Green500 list of supercomputers. Performance per watt has been suggested to be a more sustainable measure of computing than Moore's law.

System designers building parallel computers, such as Google's hardware, pick CPUs based on their performance per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself.

Spaceflight computers have hard limits on the maximum power available and also have hard requirements on minimum real-time performance. For such systems, a ratio of processing speed to required electrical power is more useful than raw processing speed (D. J. Shirley and M. K. McLelland, "The Next-Generation SC-7 RISC Spaceflight Computer", pp. 1, 2).


Definition

The performance and power consumption metrics used depend on the definition; reasonable measures of performance are FLOPS, MIPS, or the score for any performance benchmark. Several measures of power usage may be employed, depending on the purposes of the metric; for example, one metric might consider only the electrical power delivered directly to a machine, while another might include all power necessary to run a computer, such as cooling and monitoring systems. The power measurement is often the average power used while running the benchmark, but other measures of power usage may be employed (e.g. peak power, idle power).

For example, the early UNIVAC I computer performed approximately 0.015 operations per watt-second (performing 1,905 operations per second (OPS) while consuming 125 kW). The Fujitsu FR-V VLIW/vector processor system on a chip, in the four-core FR550 variant released in 2005, performs 51 giga-OPS with 3 watts of power consumption, resulting in 17 billion operations per watt-second. This is an improvement by a factor of over a trillion in 54 years.

Most of the power a computer uses is converted into heat, so a system that takes fewer watts to do a job will require less cooling to maintain a given operating temperature. Reduced cooling demands make it easier to quiet a computer. Lower energy consumption can also make it less costly to run, and reduce the environmental impact of powering the computer (see green computing). If installed where there is limited climate control, a lower-power computer will operate at a lower temperature, which may make it more reliable. In a climate-controlled environment, reductions in direct power use may also create savings in climate control energy.

Computing energy consumption is sometimes also measured by reporting the energy required to run a particular benchmark, for instance EEMBC EnergyBench. Energy consumption figures for a standard workload may make it easier to judge the effect of an improvement in energy efficiency.

Performance (in operations/second) per watt can also be written as operations/watt-second, or operations/joule, since 1 watt = 1 joule/second.
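
The arithmetic behind the UNIVAC I and FR-V figures can be checked with a short Python sketch. The function name `ops_per_joule` is hypothetical and chosen only for illustration; the inputs are the figures quoted in this section.

```python
def ops_per_joule(ops_per_second: float, watts: float) -> float:
    """Operations per joule = (operations/second) / (joules/second)."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return ops_per_second / watts


# Figures quoted in the text.
univac = ops_per_joule(1_905, 125_000)  # UNIVAC I: 1,905 OPS at 125 kW
frv = ops_per_joule(51e9, 3)            # FR-V, four FR550 cores: 51 giga-OPS at 3 W

# Ratio of the two efficiencies (dimensionless).
improvement = frv / univac

print(f"UNIVAC I: {univac:.4f} operations per joule")
print(f"FR-V:     {frv:.3e} operations per joule")
print(f"Ratio:    {improvement:.3e}")
```

Because a watt is a joule per second, dividing a rate in operations per second by a power in watts leaves operations per joule, which is why the three ways of writing the unit are interchangeable.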


FLOPS per watt

FLOPS per watt is a common measure. Like the FLOPS (floating-point operations per second) metric it is based on, the metric is usually applied to scientific computing and simulations involving many floating-point calculations.


Examples

One edition of the Green500 list rated as most efficient two supercomputers, both at RIKEN, that are based on the same Japanese PEZY-SCnp manycore accelerator technology in addition to Intel Xeon processors; the top one achieved 6,673.8 MFLOPS/watt. The third-ranked system was the Chinese-technology Sunway TaihuLight (a much bigger machine, ranked 2nd on the TOP500; the others are not on that list) at 6,051.3 MFLOPS/watt.

In June 2012, the Green500 list rated BlueGene/Q, Power BQC 16C as the most efficient supercomputer on the TOP500 in terms of FLOPS per watt, running at 2,100.88 MFLOPS/watt.

In November 2010, an IBM machine, Blue Gene/Q, achieved 1,684 MFLOPS/watt.

On 9 June 2008, CNN reported that IBM's Roadrunner supercomputer achieved 376 MFLOPS/watt.

As part of Intel's Tera-Scale research project, the team produced an 80-core CPU that can achieve over 16,000 MFLOPS/watt. The future of that CPU is not certain.

Microwulf, a low-cost desktop Beowulf cluster of four dual-core Athlon 64 X2 3800+ computers, runs at 58 MFLOPS/watt.

Kalray has developed a 256-core VLIW CPU that achieves 25,000 MFLOPS/watt. The next generation was expected to achieve 75,000 MFLOPS/watt. However, in 2019 the company's latest chip for embedded use had 80 cores and claimed up to 4 TFLOPS at 20 W.

Adapteva announced the Epiphany V, a 1024-core 64-bit RISC processor intended to achieve 75 GFLOPS/watt; the company later announced that the Epiphany V was "unlikely" to become available as a commercial product.

US Patent 10,020,436 (July 2018) claims three intervals of 100, 300, and 600 GFLOPS/watt.
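
The examples in this section mix units (MFLOPS/watt, GFLOPS/watt, and a throughput and power pair). A minimal Python sketch that puts a few of them on one scale follows; the helper names are hypothetical. The figures combine measured benchmark results with vendor targets and claims, so the resulting ordering is only indicative.

```python
# Multipliers that convert each unit to FLOPS per watt.
UNIT_TO_FLOPS_PER_WATT = {"MFLOPS/W": 1e6, "GFLOPS/W": 1e9}


def to_gflops_per_watt(value: float, unit: str) -> float:
    """Convert an efficiency figure in the given unit to GFLOPS per watt."""
    return value * UNIT_TO_FLOPS_PER_WATT[unit] / 1e9


def gflops_per_watt_from(flops: float, watts: float) -> float:
    """Derive GFLOPS per watt from a throughput (FLOPS) and a power (W)."""
    return flops / watts / 1e9


# Figures as quoted in the text of this section.
figures = {
    "Microwulf": to_gflops_per_watt(58, "MFLOPS/W"),
    "Roadrunner (2008)": to_gflops_per_watt(376, "MFLOPS/W"),
    "BlueGene/Q (June 2012)": to_gflops_per_watt(2100.88, "MFLOPS/W"),
    "Kalray 256-core": to_gflops_per_watt(25_000, "MFLOPS/W"),
    "Epiphany V (target)": to_gflops_per_watt(75, "GFLOPS/W"),
    "Kalray 80-core (2019 claim)": gflops_per_watt_from(4e12, 20),
}

# Print from least to most efficient.
for name, value in sorted(figures.items(), key=lambda item: item[1]):
    print(f"{name:30s} {value:10.3f} GFLOPS/W")
```

Read this way, the 2019 claim of 4 TFLOPS at 20 W corresponds to 200 GFLOPS/watt.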


GPU efficiency

Graphics processing units (GPUs) have continued to increase in energy usage, while CPU designers have recently focused on improving performance per watt. High-performance GPUs may draw large amounts of power, so intelligent techniques are required to manage GPU power consumption. Measures like 3DMark2006 score per watt can help identify more efficient GPUs. However, such a measure may not adequately reflect efficiency in typical use, where much time is spent doing less demanding tasks.

With modern GPUs, energy usage is an important constraint on the maximum computational capabilities that can be achieved. GPU designs are usually highly scalable, allowing the manufacturer to put multiple chips on the same video card, or to use multiple video cards that work in parallel. Peak performance of any system is essentially limited by the amount of power it can draw and the amount of heat it can dissipate. Consequently, performance per watt of a GPU design translates directly into peak performance of a system that uses that design.

Since GPUs may also be used for some general-purpose computation, their performance is sometimes measured in terms also applied to CPUs, such as FLOPS per watt.
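
The link between efficiency and peak performance under a fixed power and cooling budget can be sketched in Python. The two designs and the 300 W budget below are hypothetical values chosen for illustration, and the model assumes performance scales linearly with power, which real hardware only approximates.

```python
def peak_performance(perf_per_watt: float, power_budget_watts: float) -> float:
    """Performance attainable when the whole power budget is spent on one design."""
    return perf_per_watt * power_budget_watts


# Hypothetical GPU designs, in GFLOPS per watt.
design_a = 10.0
design_b = 15.0

# Hypothetical limit set by the power supply and cooling, in watts.
budget = 300.0

peak_a = peak_performance(design_a, budget)  # GFLOPS
peak_b = peak_performance(design_b, budget)  # GFLOPS

print(f"Design A: {peak_a:.0f} GFLOPS within {budget:.0f} W")
print(f"Design B: {peak_b:.0f} GFLOPS within {budget:.0f} W")
```

Under this simplified model, a design that is 1.5 times as efficient delivers 1.5 times the peak performance from the same budget.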


Challenges

While performance per watt is useful, absolute power requirements are also important. Claims of improved performance per watt may be used to mask increasing power demands. For instance, though newer-generation GPU architectures may provide better performance per watt, continued performance increases can negate the gains in efficiency, and the GPUs continue to consume large amounts of power.

Benchmarks that measure power under heavy load may not adequately reflect typical efficiency. For instance, 3DMark stresses the 3D performance of a GPU, but many computers spend most of their time doing less intense display tasks (idle, 2D tasks, displaying video). So the 2D or idle efficiency of the graphics system may be at least as significant for overall energy efficiency. Likewise, systems that spend much of their time in standby or soft off are not adequately characterized by efficiency under load alone. To help address this, some benchmarks, like SPECpower, include measurements at a series of load levels.

The efficiency of some electrical components, such as voltage regulators, decreases with increasing temperature, so the power used may increase with temperature. Power supplies, motherboards, and some video cards are among the subsystems affected. Because their power draw may depend on temperature, the temperature or temperature dependence should be noted when measuring.

Performance per watt also typically does not include full life-cycle costs. Since computer manufacturing is energy intensive, and computers often have a relatively short lifespan, the energy and materials involved in production, distribution, disposal and recycling often make up significant portions of their cost, energy use, and environmental impact.

Energy required for climate control of the computer's surroundings is often not counted in the wattage calculation, but it can be significant.
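
The difference between efficiency at full load and efficiency across a range of load levels can be illustrated with a small Python sketch. The measurements are hypothetical, and the aggregation shown (summed throughput divided by summed power over all levels, including idle) is one simple choice made for illustration; it is not presented as the official procedure of any particular benchmark.

```python
from typing import List, Tuple

# (load fraction, operations per second, average watts); hypothetical values.
Measurement = Tuple[float, float, float]

measurements: List[Measurement] = [
    (1.00, 1_000_000, 300.0),
    (0.50, 500_000, 200.0),
    (0.10, 100_000, 140.0),
    (0.00, 0, 120.0),  # idle: no useful work, but power is still drawn
]


def full_load_efficiency(levels: List[Measurement]) -> float:
    """Operations per joule at the highest measured load level only."""
    _, ops, watts = max(levels, key=lambda m: m[0])
    return ops / watts


def aggregate_efficiency(levels: List[Measurement]) -> float:
    """Summed throughput over summed power across all load levels."""
    total_ops = sum(ops for _, ops, _ in levels)
    total_watts = sum(watts for _, _, watts in levels)
    return total_ops / total_watts


full = full_load_efficiency(measurements)
overall = aggregate_efficiency(measurements)

print(f"Full load only:  {full:.1f} operations per joule")
print(f"All load levels: {overall:.1f} operations per joule")
```

With these numbers the all-levels figure is markedly lower than the full-load figure, because power at low load and at idle does not fall in proportion to the work done.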


Other energy efficiency measures

SWaP (space, wattage and performance) is a Sun Microsystems metric for data centers, incorporating power and space:

\mathrm{SWaP} = \frac{\mathrm{Performance}}{\mathrm{Space} \times \mathrm{Power}}

where performance is measured by any appropriate benchmark, and space is the size of the computer.

Reduction of power, mass, and volume is also important for spaceflight computers.
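
A minimal Python sketch of a SWaP comparison follows, assuming the metric is computed as performance divided by the product of space and power. The two servers, their benchmark scores, and the choice of rack units as the measure of space are hypothetical and serve only as illustration.

```python
def swap(performance: float, space_rack_units: float, power_watts: float) -> float:
    """SWaP = performance / (space * power); higher is better."""
    if space_rack_units <= 0 or power_watts <= 0:
        raise ValueError("space and power must be positive")
    return performance / (space_rack_units * power_watts)


# Hypothetical servers: benchmark score, rack units occupied, watts drawn.
old_server = swap(performance=100.0, space_rack_units=4, power_watts=800)
new_server = swap(performance=150.0, space_rack_units=2, power_watts=500)

print(f"Old server SWaP: {old_server:.5f}")
print(f"New server SWaP: {new_server:.5f}")
print(f"Improvement:     {new_server / old_server:.1f}x")
```

In this example the newer machine is only 1.5 times faster, but because it also occupies less space and draws less power, its SWaP value is 4.8 times higher.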


See also

Energy efficiency benchmarks:
* Average CPU power (ACP), a measure of power consumption when running several standard benchmarks
* EEMBC EnergyBench
* SPECpower, a benchmark for web servers running Java (Server Side Java Operations per Joule)

Other:
* Data center infrastructure efficiency (DCIE)
* Energy proportional computing
* GeForce 9 series, for a GPU list with energy use and theoretical FLOPS
* IT energy management
* Koomey's law
* Landauer's principle
* Low-power electronics
* Power usage effectiveness (PUE)
* Processor power dissipation




External links


The Green500